Abstract

In the on-line Explore & Exploit literature, central to Machine Learning, a central
planner is faced with a set of alternatives, each yielding some unknown reward. The
planner’s goal is to learn the optimal alternative as soon as possible, via experimentation.
A typical assumption in this model is that the planner has full control over
the experiment design and implementation. When experiments are implemented by a
society of self-motivated agents the planner can only recommend experimentation but
has no power to enforce it. Kremer et al. [9] introduce the first study of explore
and exploit schemes that account for agents’ incentives. In their model it is implicitly
assumed that agents neither see nor communicate with each other. Their main result is
a characterization of an optimal explore and exploit scheme. In this work we extend [9]
by adding a layer of a social network according to which agents can observe each other.
It turns out that when observability is factored in, the scheme proposed by Kremer et
al. is no longer incentive compatible. In our main result we provide a tight bound on
how many other agents each agent can observe while still admitting an incentive-compatible
algorithm with an asymptotically optimal outcome. More technically, for a setting with
$N$ agents where the number of nodes with degree greater than $N^\alpha$ is bounded by $N^\beta$
and $2\alpha + \beta < 1$ we construct an incentive-compatible asymptotically optimal mechanism.
The bound $2\alpha + \beta < 1$ is shown to be tight.


1 Introduction 

In a variety of settings members of a society are faced with a set of possible actions whose
rewards are unknown. Each agent chooses her action, and social learning may entail that
many will choose the optimal action. This may be the case for selecting among alternative
routes for traveling from one location to another, choosing a holiday destination, choosing
a service provider (e.g., an accountant or an ISP) and more. In such settings the a-priori
optimal alternative may often be a-posteriori inferior; nevertheless, as no one would like
to experiment with an a-priori inferior alternative, society might converge on the wrong
action, resulting in a market failure.

A similar dilemma is central to the on-line Explore & Exploit (E&E) paradigm, a rich research
area in Machine Learning (ML) [6]. In that setting, a central planner wants to learn the
optimal action as soon as possible. To do so he can try out various actions and, based on
the history of results, decide whether to continue with experimentation or to exploit his
knowledge. In this literature the central planner has full control over the experiment design
and the history of results. A natural question is how this E&E paradigm works out when
the experiments are actually controlled by self-motivated agents and the central planner can
only make recommendations for actions.

When modeling a society of agents where actions are taken in a decentralized fashion, one
typically accounts for some of the key aspects of the society. Such key aspects include the 
incentive structure of the individuals, the communication structure among the agents and 
the private prior information agents may hold. These three modeling ingredients are central 
to the literature on social learning, where agents have initial conflicting beliefs on the optimal 
action but learn from each other while taking actions simultaneously and repeatedly. This 
literature has its roots in Aumann’s agreeing to disagree [3] and has been later extended in a
variety of papers (e.g.,[TlE3l[l2lE]). The literature on herding also studies how these three 
components interact. In the herding literature, similar to the E&E setting above, agents act 
sequentially and each agent acts once. Initially, more emphasis has been given in the herding 
literature to the subjective information structure (e.g., |11[I51[51[2])0 However, recently the 
importance of the social network structure has been acknowledged (see [T] for a discussion of 
the observability assumption). In the herding literature one may also observe market failure 
despite the fact that the collective holds the information required for making the optimal
choice. 

The first paper to marry the social aspects with the challenge of E&E, a new research
domain for which we coin the term ‘social explore and exploit’, is Kremer et al. [9]. In that
work the authors introduce a naive setting and study optimal explore and exploit schemes
that account for agents’ incentives.^2 The following example is useful for understanding the model
of social explore and exploit due to Kremer et al.

^1 The mentioned works employ a game-theoretic setup. When restricted to convention evolution in pure
coordination games, other aspects such as the design of adaptation rules come into play [Hi-

^2 We use the notion of a ‘naive setting’ for settings where the optimal non-social explore and exploit
scheme is trivial: try all actions sequentially, each once, and settle on the optimal one thereafter.

Example 1.1. Assume there are two routes, denoted T(rain) and R(oad), from point A to
point B. The latencies of both alternatives are known to change on a daily basis. On each
given day the travel time on R is a constant that is sampled from a uniform distribution on
the interval [0, 6] (with a mean of 3 hours), whereas the travel time on T is uniformly distributed
on the interval [1, 3] (with a mean of 2 hours). A benevolent dictator would like to make sure
most agents take the faster option on any given day. To do so he would dictate to the first
arriving agent each day to use option R and to the second one to use T, after which he would
surely know which is the better option on that day. Hence, from the third agent onward he would
dictate the a-posteriori optimal action. On the other hand, without any central mechanism, no
self-motivated agent will try alternative R, and so with substantial probability all agents will
use the inferior option each day (with probability $\mathrm{Prob}(R < T) = 1/3$ to be precise).
In the ‘social E&E’ setting we introduce a central planner (one can think of it as a
recommendation engine), which observes the outcomes of the agents and makes a non-enforceable
recommendation to subsequent agents on which action to take. Kremer et al. show that the
introduction of such a central planner can lead to choosing the optimal action even when agents
are self-motivated.
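The probability quoted in Example 1.1 can be checked directly. The sketch below is only an illustration under the example's stated distributions (R uniform on [0, 6], T uniform on [1, 3], drawn independently); the function names are hypothetical and not part of the original model.

```python
import random
from fractions import Fraction


def prob_road_faster_exact():
    """Exact P(R < T) for R ~ U[0, 6] and T ~ U[1, 3] (assumed independent).

    Because [1, 3] lies inside [0, 6], P(R < t) = t / 6 for every value t
    that T can take, so P(R < T) = E[T] / 6.
    """
    mean_t = Fraction(1 + 3, 2)  # mean of U[1, 3]
    return mean_t / 6


def prob_road_faster_mc(samples=200_000, seed=0):
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(0, 6) < rng.uniform(1, 3) for _ in range(samples))
    return hits / samples


if __name__ == "__main__":
    print(prob_road_faster_exact())  # exact value as a fraction
    print(prob_road_faster_mc())     # simulated estimate
```

In other words, on roughly one day in three the a-priori inferior road is in fact the faster route, and without experimentation nobody would find out.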

To be more specific, Kremer et al. identify an incentive compatible scheme with which
a central planner can asymptotically steer the users towards taking the optimal action. This
result has been extended in [10] to several more elaborate bandit settings and to
additional optimization criteria such as regret minimization. An implicit assumption in
both papers is that agents cannot see each other’s behavior. This assumption turns out to
be critical. In fact, even with very little observability, for example when each agent just sees
his predecessor, the schemes proposed in both works cease to be incentive compatible and
lead to market failure. In this work we investigate the conditions on the social network which
allow for asymptotically optimal outcomes. Thus, we extend [9] by adding the additional
layer of a social network and show conditions under which the essence of their results, albeit
with a different mechanism, can still be maintained even though agents may observe each
other.

Needless to say, the ability to observe peers’ actions is realistic and quite common in
many applications, ranging from route recommendation to hotel recommendation and
transportation recommendation. In any of these recommendation systems some exploration
may be needed, and no one wishes to be the one to explore so that others can benefit, while some
observability of others’ recommended actions does exist.

Technically, we extend the ‘social explore and exploit’ setting and incorporate the
visibility of actions of peers in a social network, creating a more complete initial theory of
economic recommendation systems. We incorporate into the model of [9] a notion of
visibility, captured by a visibility graph. In the visibility graph agents are nodes, and an edge
(a, b) implies that agents a and b can observe each other’s action. Our main result is that
for a setting with $N$ agents, when the number of nodes with degree greater than $N^\alpha$ is
bounded by $N^\beta$, where $2\alpha + \beta < 1$, there exists an incentive-compatible algorithm leading to
an asymptotically optimal outcome. In other words, there exists a recommendation mechanism
which agents will gladly follow and which ensures that a vanishing proportion of the agents
take a sub-optimal action. We also show that the result is tight, in the sense that there is
a visibility graph with $2\alpha + \beta = 1$ (in particular, the complete graph) for which an approximately
optimal outcome cannot be obtained by any incentive-compatible algorithm.

2 Model


We consider a setting where agents arrive sequentially and must choose an action from a
finite set $A$. The reward from taking action $a \in A$ is given by a commonly known
non-atomic random variable $V_a$ taking values in some interval $I = [L, R]$ and is the same across
all agents. Agents would like to maximize their value. At stage 0 neither the agents nor the
social planner know the value of the random variables $\{V_a\}_{a \in A}$. When agent $n$ arrives the
social planner sees the choices made by agents $1, \dots, n-1$ and the corresponding rewards and
chooses to communicate some message, $m \in M$ (a recommendation), to the agent. Based on
this message, as well as other information available to the agent, she chooses some action.

This extra information available to each agent consists of the actions chosen by some of his
predecessors, those that he can observe. Let $\mathcal{N}$ denote the set of $N$ agents and for each
$n \in \mathcal{N}$ let $B(n) \subseteq \mathcal{N} \setminus \{n\}$ denote the set of friends of $n$. The agent arriving at stage $t$ gets
to see the action (but not the reward) chosen by the subset of his friends that preceded him.
Coupled with the message received from the social planner he must choose his action.

Formally, the strategy of the social planner at time $n$ is a function $M^n : (A \times I)^{n-1} \to M$.^3

Note that once some agent chooses the action $a$ the realization of $V_a$ is known thereafter to
the social planner. Let agent $n$ be the $t$-th agent arriving and let agents $\{1, 2, \dots, t-1\}$ be
his predecessors; then the strategy of agent $n$ depends on the message communicated to him
by the planner and the actions taken by the set of agents $B^t(n) = B(n) \cap \{1, 2, \dots, t-1\}$.
That is, $n$’s strategy is represented by some function $\sigma_n : A^{B^t(n)} \times M \to A$.^4

The goal of each agent is to maximize her expected reward while the goal of the planner is 
to maximize the expected average reward, or the expected proportion of agents who choose 
an ex-post optimal action. Note that the social planner would like to induce agents to 
experiment with previously untried actions. Ideally, the social planner would like each of the 
first |A| agents to experiment with a different action. Once that happens she can ensure all 
subsequent agents take the optimal action and asymptotically maximize the average reward. 
On the other hand, agents are short-sighted and do not want to experiment with ex-ante 
inferior actions. This tension is at the core of the analysis we provide. 

Hereinafter we assume the set of alternatives is binary, $A = \{a, b\}$. Let $\mu_a, \mu_b$ be the
expectations of $V_a, V_b$ respectively and assume, without loss of generality, that $\mu_a > \mu_b$.
In what follows, similar to Kremer et al. [9], we provide a direct mechanism (one for
which $M = A$) which is incentive compatible (IC) and asymptotically optimal. That is,
agents comply with the mechanism’s recommendation in equilibrium and only a vanishing
proportion of players take the inferior action, for large enough $N$.

^3 Restricting the planner to pure strategies is done for the sake of simplicity only. It is easy to see that
each of the arguments in the following sections holds true when the planner is also allowed to use mixed
strategies, and that the resulting optimal strategy of the planner is pure.

^4 Our model assumes that agents know their position in the sequence. However, the results reported here
extend to the case where agents arrive randomly and do not know their position in the sequence.

2.1 The No Visibility Case


We begin with the exact setting of Kremer et al. [9] and assume no visibility across agents
(‘blind’ agents). Formally this is the setting where $B(n) = \emptyset\ \forall n$. The mechanism we
formulate differs from the one introduced in [9]. Whereas we do not know how to adapt
the Kremer et al. mechanism to the case with visibility, ours can be adapted, as we do in
subsequent sections.

The underlying idea in the following construction is for the mechanism to recommend
action $a$ to the first agent, who happily complies. Thereafter the mechanism knows the value
of $V_a$. The mechanism commits up front, for each agent in a finite set of agents, to recommend
action $b$ only when $V_a$ falls into some pre-defined set. For this to be incentive compatible the expected
value of $V_a$, conditional on that set (which is the expectation of $V_a$ from the perspective of
the agent that is recommended $b$), must not be greater than $\mu_b$. If the mechanism can ensure
that the aforementioned sets cover all the possible realizations of $V_a$ then surely at least one agent
will try action $b$. Note that these sets need not be disjoint and so it might be the
case that more than one agent is recommended $b$.

These sets will be induced by some partition of the interval $I$ which is defined in the next
Lemma:

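Lemma 2.1. There exists a finite partition $\{D_k\}_{k=0}^{K}$ of $[L, R]$ such that $D_0 = [L, \mu_b)$,
$E(V_a | D_0 \cup D_k) = \mu_b\ \forall 1 \le k \le K-1$ and $E(V_a | D_0 \cup D_K) \le \mu_b$. Furthermore,

$$\frac{\mu_a - \mu_b}{P(V_a \in D_0)[\mu_b - E(V_a|D_0)]} + 1 \;\le\; K \;\le\; \frac{\mu_a - \frac{\mu_b}{2}}{P(V_a \in D_0)[\mu_b - E(V_a|D_0)]} + 2.$$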


Proof: Consider the function $f_1(x) = E(V_a | D_0 \cup [\mu_b, \mu_b + x])$ defined for any nonnegative
$x$. Note that $f_1(0) < \mu_b$ and $f_1(R - \mu_b) = \mu_a > \mu_b$. As $f_1$ is continuous there exists, by the
intermediate value theorem, a value $\hat{x}$ for which $f_1(\hat{x}) = \mu_b$. Set $x_1 = \mu_b + \hat{x}$ and
$D_1 = [\mu_b, x_1)$ (in fact, $x_1$ is unique as $f_1$ is monotonic).

If $E(V_a | D_0 \cup [\mu_b, R] \setminus D_1) \le \mu_b$ then set $K = 2$ and $D_2 = [\mu_b, R] \setminus D_1$. Otherwise, consider
the function $f_2(x) = E(V_a | D_0 \cup [x_1, x_1 + x])$. Applying the intermediate value theorem
as before, there must be some $x_2 > x_1$ such that $E(V_a | D_0 \cup [x_1, x_2]) = \mu_b$. Set $D_2 = [x_1, x_2)$.

Repeat this iteratively: Assume $D_j = [x_{j-1}, x_j)$ have been defined for $j = 1, \dots, k-1$.
If $E(V_a | D_0 \cup [x_{k-1}, R]) \le \mu_b$ then set $K = k$, $D_k = [x_{k-1}, R]$, and halt. Otherwise, let
$f_k(x) = E(V_a | D_0 \cup [x_{k-1}, x_{k-1} + x])$. By applying the intermediate value theorem
as before, there must be some $x_k > x_{k-1}$ such that $E(V_a | D_0 \cup [x_{k-1}, x_k]) = \mu_b$. Set $D_k = [x_{k-1}, x_k)$.

We now turn to argue that the above procedure eventually halts. To see this note that

$$\mu_b = E(V_a | D_0 \cup D_k) = \frac{P(V_a \in D_0)E(V_a|D_0) + P(V_a \in D_k)E(V_a|D_k)}{P(V_a \in D_0) + P(V_a \in D_k)} \implies$$

$$P(V_a \in D_0)E(V_a|D_0) + P(V_a \in D_k)E(V_a|D_k) = \mu_b[P(V_a \in D_0) + P(V_a \in D_k)] \implies$$

$$P(V_a \in D_k)[E(V_a|D_k) - \mu_b] = P(V_a \in D_0)[\mu_b - E(V_a|D_0)] \implies$$

$$P(V_a \in D_k)E(V_a|D_k) \ge P(V_a \in D_0)[\mu_b - E(V_a|D_0)].$$

Note that the right hand side of the last inequality is some positive number, $\delta$, independent
of $k$, and also that $E(V_a|D_k) \le R$. Hence $P(V_a \in D_k) \ge \frac{\delta}{R}$ for every set $D_k$ constructed before halting.

Summing over these sets: $1 \ge \sum_k P(V_a \in D_k) \ge \sum_k \frac{\delta}{R}$, so at most $\frac{R}{\delta}$ such sets can be
constructed and the procedure halts with a finite $K$.

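Let us now compute the upper bound on the value of $K$.

Note that since $E(V_a|D_0 \cup D_{K-1}) \ge E(V_a|D_0 \cup D_K)$ and $E(V_a|D_K) > E(V_a|D_{K-1})$ we can
conclude that $P(V_a \in D_K) \le P(V_a \in D_{K-1})$. As $P(V_a \in D_K) + P(V_a \in D_{K-1}) \le 1$ we get
$P(V_a \in D_K) \le \frac{1}{2}$. Therefore, from the above we can conclude that:

$$\sum_{k=1}^{K-1} P(V_a \in D_0 \cup D_k)E(V_a|D_0 \cup D_k) = \mu_b[(K-1)P(V_a \in D_0) + P(V_a \in D_1 \cup \dots \cup D_{K-1})] =$$
$$\mu_b[(K-2)P(V_a \in D_0) + 1 - P(V_a \in D_K)] \ge \mu_b[(K-2)P(V_a \in D_0) + \tfrac{1}{2}]. \quad (1)$$

On the other hand we may substitute $E(V_a|D_0 \cup D_k)$ with
$\frac{P(V_a \in D_0)E(V_a|D_0) + P(V_a \in D_k)E(V_a|D_k)}{P(V_a \in D_0 \cup D_k)}$ and so:

$$\sum_{k=1}^{K-1} P(V_a \in D_0 \cup D_k)E(V_a|D_0 \cup D_k) =$$
$$\sum_{k=1}^{K-1} [P(V_a \in D_0)E(V_a|D_0) + P(V_a \in D_k)E(V_a|D_k)] =$$
$$(K-1)P(V_a \in D_0)E(V_a|D_0) + \sum_{k=1}^{K-1} P(V_a \in D_k)E(V_a|D_k) =$$
$$(K-2)P(V_a \in D_0)E(V_a|D_0) + \mu_a - P(V_a \in D_K)E(V_a|D_K) \le$$
$$(K-2)P(V_a \in D_0)E(V_a|D_0) + \mu_a. \quad (2)$$

From equations (1) and (2) we get:

$$(K-2)P(V_a \in D_0)E(V_a|D_0) + \mu_a \ge [(K-2)P(V_a \in D_0) + \tfrac{1}{2}]\mu_b \implies$$

$$K \le \frac{\mu_a - \frac{\mu_b}{2}}{P(V_a \in D_0)[\mu_b - E(V_a|D_0)]} + 2.$$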

Finally, we also compute a lower bound on the value of $K$.^5

^5 Note that we make use of this lower bound when we study high visibility graphs in Section 2.3.

$$\sum_{k=1}^{K} P(V_a \in D_0 \cup D_k)E(V_a|D_0 \cup D_k) \le \mu_b[K P(V_a \in D_0) + P(V_a \in D_1 \cup \dots \cup D_K)] =$$
$$\mu_b[(K-1)P(V_a \in D_0) + 1]. \quad (3)$$

On the other hand:

$$\sum_{k=1}^{K} P(V_a \in D_0 \cup D_k)E(V_a|D_0 \cup D_k) = \sum_{k=1}^{K} [P(V_a \in D_0)E(V_a|D_0) + P(V_a \in D_k)E(V_a|D_k)]$$
$$= K P(V_a \in D_0)E(V_a|D_0) + \sum_{k=1}^{K} P(V_a \in D_k)E(V_a|D_k) = (K-1)P(V_a \in D_0)E(V_a|D_0) + \mu_a. \quad (4)$$

Combining equations (3) and (4) we get:

$$(K-1)P(V_a \in D_0)E(V_a|D_0) + \mu_a \le [(K-1)P(V_a \in D_0) + 1]\mu_b \implies$$

$$K \ge \frac{\mu_a - \mu_b}{P(V_a \in D_0)[\mu_b - E(V_a|D_0)]} + 1.$$

Q.E.D
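The construction in the proof of Lemma 2.1 can be traced numerically. The following is a minimal sketch, assuming for illustration only that $V_a$ is uniform on $[lo, hi]$ (so $\mu_a$ is the midpoint) and that $lo < \mu_b < \mu_a$; the function names, the bisection tolerance, and the example values are hypothetical and not taken from the paper.

```python
def uniform_partition(lo, hi, mu_b, tol=1e-12):
    """Cut points mu_b = x_0 < x_1 < ... < x_K = hi for V_a ~ Uniform[lo, hi].

    D_0 = [lo, mu_b) and D_k = [x_{k-1}, x_k). Each D_k (k < K) is chosen so
    that P(D_k) * (E[V_a | D_k] - mu_b) equals
    delta = P(D_0) * (mu_b - E[V_a | D_0]), i.e. E[V_a | D_0 u D_k] = mu_b.
    """
    if not lo < mu_b < (lo + hi) / 2:
        raise ValueError("sketch assumes lo < mu_b < mu_a = (lo + hi) / 2")
    width = hi - lo

    def excess(u, v):
        # P([u, v)) * (E[V_a | [u, v)] - mu_b) under the uniform assumption
        return (v - u) / width * ((u + v) / 2 - mu_b)

    delta = (mu_b - lo) / width * (mu_b - (lo + mu_b) / 2)
    cuts = [mu_b]
    # Halt once the remaining segment satisfies E[V_a | D_0 u rest] <= mu_b.
    while excess(cuts[-1], hi) > delta:
        left, right = cuts[-1], hi
        while right - left > tol:  # bisection: excess is increasing in v
            mid = (left + right) / 2
            if excess(cuts[-1], mid) < delta:
                left = mid
            else:
                right = mid
        cuts.append((left + right) / 2)
    cuts.append(hi)
    return cuts, delta


def lemma_bounds(lo, hi, mu_b, delta):
    """Lower and upper bounds on K as stated in Lemma 2.1."""
    mu_a = (lo + hi) / 2
    return (mu_a - mu_b) / delta + 1, (mu_a - mu_b / 2) / delta + 2


if __name__ == "__main__":
    cuts, delta = uniform_partition(0.0, 1.0, 0.4)
    print(len(cuts) - 1, cuts, lemma_bounds(0.0, 1.0, 0.4, delta))
```

For $V_a$ uniform on $[0, 1]$ and $\mu_b = 0.4$ the sketch yields three sets to the right of $\mu_b$, which lies between the lower and upper bounds of the lemma.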

A direct revelation mechanism is a mechanism for which the message space equals the
action space, $M = \{a, b\}$. Given the partition $\{D_k\}_{k=0}^{K}$ we define the following direct
revelation mechanism for our social planner:


No Visibility Mechanism:

• If $P(V_a < \mu_b) = 0$ then set $M^n = a\ \forall n$.

• If $P(V_a < \mu_b) > 0$ then:

1. Set $M^1 = a$.

2. For $n = 2, \dots, K+1$ let $M^n = b$ whenever $V_a \in (D_0 \cup D_{n-1})$ and $M^n = a$
otherwise.

3. Let $c \in A$ be the best action among those chosen by agents $1, \dots, K+1$. For
any agent $n > K+1$ set $M^n = c$.
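The three steps above can be sketched in code. This is an illustrative sketch only: it assumes the partition is given as a sorted list of boundaries `bounds = [L, mu_b, x_1, ..., x_{K-1}, R]`, that all agents comply, and that $P(V_a < \mu_b) > 0$; the function name and the example numbers are hypothetical.

```python
from bisect import bisect_right


def no_visibility_recommendations(v_a, v_b, bounds, n_agents):
    """Recommendations M^1..M^n of the No Visibility Mechanism (sketch).

    bounds = [L, mu_b, x_1, ..., x_{K-1}, R], so D_0 = [bounds[0], bounds[1])
    and D_k = [bounds[k], bounds[k+1]) for k >= 1.
    """
    K = len(bounds) - 2
    # Index k of the set D_k containing the realized v_a (v_a == R falls in D_K).
    k_of_va = min(bisect_right(bounds, v_a) - 1, K)

    recs = ["a"]      # step 1: the first agent is recommended a
    b_tried = False
    for n in range(2, n_agents + 1):
        if n <= K + 1:
            # step 2: recommend b exactly when v_a lies in D_0 u D_{n-1}
            rec = "b" if k_of_va in (0, n - 1) else "a"
            b_tried = b_tried or rec == "b"
        else:
            # step 3: best action among those already tried
            rec = "b" if b_tried and v_b > v_a else "a"
        recs.append(rec)
    return recs


if __name__ == "__main__":
    bounds = [0.0, 0.4, 0.8, 0.9657, 1.0]  # K = 3, illustrative values
    print(no_visibility_recommendations(0.9, 0.95, bounds, 6))
```

With these illustrative boundaries and $V_a = 0.9$, only the third agent is asked to experiment with $b$; if $V_a$ falls in $D_0$, every one of the agents $2, \dots, K+1$ is recommended $b$.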


We now turn to argue that the No Visibility Mechanism is incentive compatible, that is,
each agent will take the action that is recommended to him by the planner. Hence one of the
agents $2, \dots, K+1$ will surely try action $b$. This, in turn, implies that all agents $n > K+1$
will be recommended the optimal action.

Theorem 2.2. The No Visibility Mechanism is incentive compatible.

Proof: Since $\mu_a > \mu_b$ the first agent will clearly comply with the social planner’s
recommendation to take action $a$.

For each agent $2 \le j \le K+1$:

• Note that the event $M^j = b$ is the same as the event $V_a \in (D_0 \cup D_{j-1})$. Thus, the
expected reward from taking action $a$ is $E(V_a | D_0 \cup D_{j-1}) \le \mu_b$ (with equality for
$j \le K$), which is no more than the expected reward from taking action $b$. Agent $j$ therefore
weakly prefers $b$ and might as well take action $b$ as the planner recommended.

• The event $M^j = a$ is the same as the event $V_a \notin (D_0 \cup D_{j-1})$. However, as
$E(V_a | D_0 \cup D_{j-1}) \le \mu_b < \mu_a$ we conclude that the expected reward from taking action $a$,
given the message $M^j = a$, is $E(V_a | V_a \notin (D_0 \cup D_{j-1})) > \mu_b$, whereas the expected reward
from taking action $b$ is $\mu_b$. Therefore agent $j$ will prefer action $a$, as recommended by the planner.

Recall that $\cup_{k=0}^{K} D_k = [L, R]$, which implies that at least one agent will be recommended,
and consequently choose, action $b$. Therefore, for any agent $j > K+1$ the planner
recommends the optimal action and so agents will comply.

Q.E.D

Definition 2.3. Let $U_j$ be the utility of agent $j$. An incentive compatible direct revelation
algorithm is asymptotically optimal if $\forall \epsilon > 0$, $\exists \bar{N}$ such that $\forall N > \bar{N}$, $\forall V_a, V_b \in I$:

$$\frac{\sum_{j=1}^{N} U_j}{N \max(V_a, V_b)} \ge 1 - \epsilon.$$

In words, the average utility of the agents goes to $\max\{V_a, V_b\}$.

Corollary 2.4. If $P(V_a < \mu_b) > 0$ then the No Visibility Mechanism is asymptotically
optimal.

Proof: By Theorem 2.2 the No Visibility Mechanism is incentive compatible and so after
the first $K+1$ agents the social planner knows the values of both $V_a$ and $V_b$. This ensures
that agents $K+2, \dots, N$ will take the optimal action. As $K$ is independent of the total number
of agents, $N$, the proportion of agents taking the optimal action increases to one as $N$ grows.
Q.E.D

Note that this gives an alternative technique to that of [9], which can later be generalized to
address the case of a network that allows for visibility.

2.2 The Medium Visibility Case 

We next turn to study the case where all agents do have some visibility, albeit limited
visibility. In particular we assume that $|B(n)| \le N^\alpha\ \forall n$ for some $\alpha < 0.5$. Unfortunately
we cannot use the No Visibility Mechanism as it may fail to be incentive compatible
whenever $B(n) \ne \emptyset$ (at least for the first $K$ agents). We turn to explain the underlying
reasons we lose the IC (incentive compatible) property:

1. What happens when $k > j$, both belong to the group of $K$ experimenting agents, and $j \in B(k)$?
Consider an instance where $V_a \in D_k$. In that case, assuming IC, $j$ will take action $a$ and $k$ will be
recommended action $b$. From these two observations, agent $k$ concludes that
$V_a \in (D_0 \cup D_k) \setminus (D_0 \cup D_j) = D_k$, in which case he will take action $a$, contradicting IC.

2. What happens when $k > v > j > i$, where $i, j, k$ belong to the group of $K$ experimenting agents
but $v$ does not, both $i, j$ are in $B(v)$, and $v$ is in $B(k)$? Consider an instance where $V_a \in D_k$. In that case,
assuming IC, $i$ and $j$ will take action $a$, $v$ will see that both $i$ and $j$ took action $a$
and will take action $a$ as recommended, and $k$ will be recommended action $b$. But $k$
can see that $v$ took action $a$. However, if $V_a \in D_0$ then, assuming IC, both $i$ and $j$ will
take action $b$ and since $v$ can see both of them he can conclude $V_a \in D_0$ and "herd" on $b$.
Therefore $k$ can conclude $V_a \in (D_0 \cup D_k) \setminus D_0 = D_k$, in which case he will take action $a$,
contradicting IC.

However, a variant of the No Visibility Mechanism, which we term the Medium Visibility
Mechanism, works. The way we adapt to the medium visibility case is by choosing the set
of $K$ agents in a way that they do not see each other, directly or indirectly, since such
visibility is what makes the original mechanism fail. This will entail an increase in the number of initial agents which
are not necessarily recommended the optimal action from $K+1$ to a larger number, but
nevertheless the asymptotic efficiency will still prevail.

To introduce this variant we use the following notation: For a subset of agents $\hat{N} \subseteq \mathcal{N}$,
$B(\hat{N}) = \cup_{n \in \hat{N}} B(n)$. In words, $B(\hat{N})$ is the set of neighbors of $\hat{N}$.

The main idea behind the following algorithm is to find $K$ agents that cannot see each
other and, moreover, such that no other agent (outside of those $K$ agents)
is able to see two or more agents from this group, so that no other agent can reflect the
group's choices to other agents from the group. Note that this algorithm is dynamic and we
do not need to know the order of arrival in advance.


Medium Visibility Mechanism:

• Let $M = \{a, b\}$

• If $P(V_a < \mu_b) = 0$ then $M^n = a\ \forall n$

• If $P(V_a < \mu_b) > 0$ then set $M^1 = a$, $\rho = \emptyset$, $k = 0$, and $\hat{N} = \emptyset$

• For $n = 2, \dots, N$:

- While $k < K$ do:

* If $n \in \{B(\rho) \cup B(B(\rho))\}$ then $M^n = a$ and $\hat{N} = \hat{N} \cup \{n\}$.

* If $n \notin \{B(\rho) \cup B(B(\rho))\}$ then $\rho = \rho \cup \{n\}$, $k = k + 1$ and

· $M^n = b$ whenever $V_a \in (D_0 \cup D_k)$

· $M^n = a$ otherwise.

- If $k = K$ then $M^n = \arg\max_{a,b}(V_a, V_b)$.
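The selection of the experimenting agents in the mechanism above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the visibility graph is assumed to be given as a dictionary of friend sets, agent 1 (who is always recommended $a$) is assumed to be excluded from the arrival list, and the names `select_experimenters`, `rho`, and `skipped` are hypothetical.

```python
def select_experimenters(arrival_order, neighbors, K):
    """Greedy, online choice of the set rho from the Medium Visibility Mechanism.

    arrival_order: agents in order of arrival (agent 1 excluded).
    neighbors: dict mapping each agent to the set B(n) of its friends.
    Returns (rho, skipped): the experimenting agents and the agents that were
    passed over (and recommended a) because they lie in B(rho) u B(B(rho)).
    """
    rho, skipped = [], []
    blocked = set()  # B(rho) u B(B(rho)), updated as rho grows
    for n in arrival_order:
        if len(rho) == K:
            break  # k = K: the experiment phase is over
        if n in blocked:
            skipped.append(n)
        else:
            rho.append(n)
            first_order = neighbors.get(n, set())
            blocked |= first_order
            for m in first_order:
                blocked |= neighbors.get(m, set())
    return rho, skipped


if __name__ == "__main__":
    # A path graph 2 - 3 - 4 - 5 - 6 - 7 (illustrative example).
    path = {2: {3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 7}, 7: {6}}
    print(select_experimenters([2, 3, 4, 5, 6, 7], path, K=2))
```

On the path graph, agent 2 becomes the first experimenter, agents 3 and 4 are passed over as first- and second-order neighbors, and agent 5 becomes the second experimenter.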


Note that the above mechanism essentially applies the No Visibility Mechanism to the
subset $\rho$ of agents. In the process it ‘ignores’ another set of agents, those that have high
visibility, denoted $\hat{N}$. The next lemma argues that the agents in $\hat{N}$ have limited
visibility into $\rho$:

Lemma 2.5. Any agent $n \in \hat{N}$ sees at most one agent in $\rho$.

Proof: Let us assume this is not true and that in fact there exist $i, j \in \rho$ such that $i < j$
and $i, j \in B(n)$. Let $i, j$ be the first two such agents to satisfy these requirements. Note that
$n \in B(i)$ and $j \in B(n)$, which implies that $j \in B(B(i))$. This, in turn, is a contradiction to
the fact that $j \in \rho$.

Q.E.D

Theorem 2.6. The Medium Visibility Mechanism is incentive compatible.

Proof: Similar to the No Visibility Mechanism, the first agent will get the message $a$ and
will optimally comply. We now consider 3 cases: an agent in $\hat{N}$, an agent in $\rho$ and agents $j$
arriving when $k = K$.

1. Consider an agent $n \in \hat{N}$ who is necessarily recommended action $a$. Assume all other
agents follow the recommendation of the mechanism. By Lemma 2.5, $|B(n) \cap \rho| \le 1$.
If $|B(n) \cap \rho| = 0$ then agent $n$ has no information above and beyond his
prior and hence chooses action $a$ as he is recommended. If $|B(n) \cap \rho| = 1$ then the
agent in $\rho$ that $n$ observes, say agent $j$, may have taken either action $a$ or $b$. In the
former case $n$ infers that $V_a \notin (D_0 \cup D_j)$, which implies that $a$ is better than $b$ in expectation and so
action $a$ is chosen. In the latter case $n$ infers that $V_a \in (D_0 \cup D_j)$, from which he can
only conclude that the expected reward of both actions is equal and hence will also
take action $a$.

2. Consider an agent $j \in \rho$ and assume all other agents follow the recommendation of the
mechanism. According to the Medium Visibility Mechanism, $j \notin B(\rho \setminus \{j\})$. Therefore,
all the predecessors observed by agent $j$ have received no information and so provide
$j$ with no information themselves. Therefore, his expected reward from both actions,
given the recommendation of the Medium Visibility Mechanism, is the same as that of
agent $j+1$ in the case $B(n) = \emptyset$ under a recommendation of the No Visibility Mechanism.
Incentive compatibility of $j$ now follows from Theorem 2.2.

3. Consider an agent $j$ arriving when $k = K$ and assume all other agents follow the
recommendation of the mechanism. Recall that $\cup_{k=0}^{K} D_k = [L, R]$, which implies that
at least one agent from $\rho$ chose action $b$. Therefore, the planner recommends agent $j$
the optimal action and so he will comply.

Q.E.D


Theorem 2.7. If $|B(n)| \le N^\alpha\ \forall n$ then the value of the parameter $k$ of the Medium Visibility
Mechanism terminates in $K$ whenever $N \ge 2(K-1)N^{2\alpha}$. Furthermore, in that case
$|\rho \cup \hat{N}| \le 2(K-1)N^{2\alpha}$.

Proof: Assume the algorithm terminates with a value $j < K$. By the construction
$|B(\rho)| \le N^\alpha|\rho|$ and so $|B(B(\rho))| \le N^{2\alpha}|\rho|$ at the termination. As $|\rho| = j$ there are at most
$j(N^\alpha + N^{2\alpha})$ agents in $B(\rho) \cup B(B(\rho))$. And so if there are more than $j(N^\alpha + N^{2\alpha})$ additional
agents, one of them must satisfy the conditions required to join $\rho$. This holds true for
any $j = 1, \dots, K-1$ whenever there are initially at least $2(K-1)N^{2\alpha}$
agents. Hence a contradiction.

Q.E.D

Corollary 2.8. If $|B(n)| \le N^\alpha\ \forall n$ and $\alpha < 0.5$ then the Medium Visibility Mechanism is
asymptotically optimal.

Proof: By Theorems 2.6 and 2.7 the Medium Visibility Mechanism is incentive compatible
for large enough $N$ and so after $k = K$ the social planner knows the values of $V_a$ and $V_b$.
By Theorem 2.7, $k = K$ after at most $2(K-1)N^{2\alpha}$ agents, and so at most $2(K-1)N^{2\alpha}$ agents will
take the inferior action. As $K$ is independent of the total number of agents, $N$, and $2\alpha < 1$,
the proportion of agents taking the optimal action increases to one as $N$ grows.

Q.E.D
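To see why $\alpha < 0.5$ matters in Corollary 2.8, one can tabulate the bound $2(K-1)N^{2\alpha}$ on the number of possibly misdirected agents as a fraction of $N$. The snippet below is a small numerical sketch; the function name and the sample values of $K$, $\alpha$ and $N$ are hypothetical.

```python
def max_suboptimal_fraction(n_agents, K, alpha):
    """Upper bound 2(K-1)N^(2*alpha) / N on the share of agents who may be
    recommended a sub-optimal action (bound taken from Theorem 2.7)."""
    return 2 * (K - 1) * n_agents ** (2 * alpha) / n_agents


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        # alpha = 0.25 gives a vanishing share; alpha = 0.5 does not.
        print(n, max_suboptimal_fraction(n, K=3, alpha=0.25),
              max_suboptimal_fraction(n, K=3, alpha=0.5))
```

For $\alpha = 0.25$ the share shrinks like $N^{-1/2}$, whereas at $\alpha = 0.5$ the bound stays constant and no longer guarantees a vanishing share.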

2.3 The High Visibility Case 

We next extend our results and mechanisms to the case where a limited number of agents
may have a high number of neighbors. By this we mean that there exist agents for
which $|B(n)| > N^\alpha$, but there are fewer than $N^\beta$ such agents, where $\alpha$ and $\beta$ are
non-negative parameters satisfying $2\alpha + \beta < 1$. Note that the Medium Visibility case satisfies these
restrictions, with $\alpha < 0.5$ and $\beta = 0$.

The Medium Visibility Mechanism offers a solution when $|B(n)| \le N^\alpha\ \forall n$. However it
may fail whenever there is even a single agent $j$ with $|B(j)| > N^\alpha$. The failure is due to the
fact that the algorithm may terminate while $k = 1$, in which case the conditions of Theorem
2.7 are not satisfied. As an example of such an outcome consider a star-shaped graph and
an arrival order where the central agent arrives last.^6

However, a variant of the No Visibility Mechanism and the Medium Visibility Mechanism
works. The way we adapt the mechanism is by replicating the set of $K$ sets many times.
Recall that each of the original $K$ sets was a union of two sets: $D_0$ from the left hand side
of the mean $\mu_b$ and some $D_k$ from the right hand side of $\mu_b$. This construction implies that
whenever an agent is recommended $b$ but happens to see that some agent before him took
action $a$ then he can conclude that $V_a > \mu_b$ and refuse to accept the recommendation. To
remedy this we construct the replicas in such a way that there is no overlap of the left hand
side of one set from one replica with the left hand side of another set from another replica.
Thus, if a low visibility agent that was recommended $b$ by the mechanism sees some high
visibility agent that has taken action $a$ he will not be ‘polluted’ by that action and will
comply with the recommendation to take action $b$. To achieve this the mechanism uses a
given replica of $K$ sets as long as no high visibility agent arrives. When a high visibility
agent arrives the mechanism moves to the next replica of $K$ sets.

^6 Recall that the mechanism is not forward looking and hence does not know that the central agent comes last.

Let us now turn to the construction. Let $\{D_0^0, \dots, D_0^{N^\beta}\}$ be a partition of $D_0 = [L, \mu_b)$
such that $E(V_a | V_a \in D_0^j) = E(V_a | V_a \in D_0)$ and $P(V_a \in D_0^j) = \frac{1}{N^\beta + 1} P(V_a \in D_0)$
$\forall j = 0, \dots, N^\beta$.^7

Let $\{D_1, D_2, \dots, D_{K'}\}$ be a partition of the segment $[\mu_b, R]$ such that
$E(V_a | V_a \in D_0^j \cup D_i) = \mu_b\ \forall j = 0, \dots, N^\beta,\ i = 1, \dots, K'$.
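One way to picture the sub-segments $D_0^j$ is the following sketch, which assumes (for illustration only) that $V_a$ is uniform on $[lo, hi]$. Under that assumption each piece can be taken as a pair of intervals placed symmetrically around the midpoint of $D_0$, which gives all pieces the same probability and the same conditional mean. The function names are hypothetical, and in the paper's notation `pieces` would equal $N^\beta + 1$.

```python
def split_d0_uniform(lo, mu_b, pieces):
    """Split D_0 = [lo, mu_b) into `pieces` sets, each a symmetric pair of
    intervals around the midpoint of D_0 (uniform V_a assumed)."""
    mid = (lo + mu_b) / 2
    h = (mu_b - lo) / (2 * pieces)  # width of each half of a piece
    return [
        ((mid - (j + 1) * h, mid - j * h), (mid + j * h, mid + (j + 1) * h))
        for j in range(pieces)
    ]


def prob_and_mean(piece, lo, hi):
    """Probability and conditional mean of a union of intervals, V_a ~ U[lo, hi]."""
    width = hi - lo
    mass = sum((v - u) / width for u, v in piece)
    mean = sum((v - u) / width * (u + v) / 2 for u, v in piece) / mass
    return mass, mean


if __name__ == "__main__":
    for piece in split_d0_uniform(0.0, 0.4, pieces=4):
        print(piece, prob_and_mean(piece, 0.0, 1.0))
```

Each piece carries a quarter of the probability of $D_0$ and has the same conditional mean as $D_0$ itself, which is what the construction requires; for a general non-atomic distribution a different splitting would be needed.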


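Lemma 2.9. $K' \le \frac{\mu_a - \frac{\mu_b}{2}}{\mu_a - \mu_b}(N^\beta + 1)(K - 1) + 2$, where $K$ is the number of segments from Lemma 2.1.

Proof: The upper bound computed in Lemma 2.1 for $K$ now applies to $K'$. Hence, for
any set $D_0^j$:

$$K' \le \frac{\mu_a - \frac{\mu_b}{2}}{P(V_a \in D_0^j)[\mu_b - E(V_a | V_a \in D_0^j)]} + 2 = \frac{\mu_a - \frac{\mu_b}{2}}{\frac{1}{N^\beta + 1} P(V_a \in D_0)[\mu_b - E(V_a | V_a \in D_0)]} + 2 =$$

$$(N^\beta + 1)\,\frac{\mu_a - \frac{\mu_b}{2}}{\mu_a - \mu_b} \cdot \frac{\mu_a - \mu_b}{P(V_a \in D_0)[\mu_b - E(V_a | V_a \in D_0)]} + 2 \le \frac{\mu_a - \frac{\mu_b}{2}}{\mu_a - \mu_b}(N^\beta + 1)(K - 1) + 2,$$

where the last inequality follows from the lower bound on $K$ computed in Lemma 2.1.

Q.E.D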

Fix some parameter $\alpha$ and let $T = \{n : |B(n)| \le N^\alpha\}$. Let $S$ be the remaining set of
agents and assume $\beta$ satisfies $|S| \le N^\beta$.

The following mechanism is a variant of the Medium Visibility Mechanism. As usual, the
first agent takes action $a$ and $V_a$ is revealed. At some point an agent is chosen as a candidate
for a dynamic message (all others get the action $a$). This agent should not be a (first or
second order) neighbor of any previous such agent. In contrast with the Medium Visibility
Mechanism, the notion of neighbor we use is a neighbor in the sub-graph induced by the
set $T$. The message to this agent depends on whether or not $V_a \in D_0^z \cup D_i$, where $i$ is the
counter of the candidates and $z$ is the counter for the number of agents from $S$ that have
appeared so far. The extra trick we use here is that whenever an agent with many neighbors
arrives we use a new sub-segment of $D_0$, and as a result candidate agents cannot conclude
anything from observing agents in $S$.

Let us denote by $B_T(j) = B(j) \cap T$ the neighbors of $j$ in $T$, and naturally extend this
to sets as follows: $B_T(\hat{N}) = \cup_{n \in \hat{N}} B_T(n)$.


High Visibility Mechanism: Let $M = \{a, b\} \times \{TRUE, FALSE, SPECIAL\}$. If
$P(V_a < \mu_b) = 0$ then $M^n = (a, TRUE)\ \forall n$. Otherwise:

1. set $M^1 = (a, TRUE)$

2. set experiment = TRUE, $z = 0$, $\rho = \emptyset$, $k = 0$, knowledge = FALSE and $c = a$

3. For agents $n = 2, \dots, N$:

• if (experiment = TRUE)

While $k < K'$ do:

- If $n \in S$ then

* let $z = z + 1$

* if knowledge = TRUE:

· set experiment = FALSE.

· set $c = \arg\max_{a,b}(V_a, V_b)$.

· $M^n = (c, FALSE)$

* if knowledge = FALSE:

· $M^n = (a, TRUE)$.

- If $n \in T$ then

* If $n \notin B_T(\rho) \cup B_T(B_T(\rho))$ then $\rho = \rho \cup \{n\}$ and $k = k + 1$.

· If $V_a \in D_0^z \cup D_k$ then $M^n = (b, TRUE)$ and set knowledge = TRUE.

· Otherwise $M^n = (a, TRUE)$

* If $n \in B_T(\rho) \cup B_T(B_T(\rho))$ then $M^n = (a, TRUE)$.

If $k = K'$ then:

- if knowledge = TRUE:

* set experiment = FALSE.

* set $c = \arg\max_{a,b}(V_a, V_b)$

* $M^n = (c, FALSE)$

- if knowledge = FALSE: (special case where $V_a$ lies in a union of sub-segments of $D_0$)

* $M^n = (b, SPECIAL)$

* knowledge = TRUE

• if (experiment = FALSE) then $M^n = (c, FALSE)$

^7 This is feasible as $V_a$ is non-atomic.


Theorem 2.10. The High Visibility Mechanism is incentive compatible^ 

Proof: Similar to the No Visibility Mechanism and the Medium Visibility Mechanism,
the first agent will get the message $(a, \mathrm{TRUE})$, as set in step 1 of the mechanism, and will
optimally comply. We now consider 7 cases:

^As the message space is formally not equal to the action space, what we mean by IC is that agents
will comply with the first component of the message, which is an action.



1. Case 1: (experiment = TRUE, $k < K'$, $n \in S$ and knowledge = FALSE): In
this case the mechanism recommended $(a, \mathrm{TRUE})$. Assume all other agents follow the
recommendation of the mechanism. Since $V_b$ is not known to the planner, all the
actions that agent $n$ can see are $a$. And since $\mu_a > \mu_b$, agent $n$ will take
action $a$ as recommended.

2. Case 2: (experiment = TRUE, $k < K'$, $n \in S$ and knowledge = TRUE) (the
first arriving agent from $S$ following knowledge = TRUE): Assume all other agents
follow the recommendation of the mechanism. In this case the High Visibility
Mechanism will set experiment = FALSE and compute the optimal action before
sending the message to $n$. The message sent to agent $n$ will contain FALSE, which
informs him that the experiment phase is over. Whenever the flag in the message
is FALSE (experiment phase over) the planner recommends agent $n$ the optimal action,
and so he will comply.

3. Case 3: (experiment = TRUE, $k < K'$, $n \in T$ and $n \in p$). Assume all other
agents follow the recommendation of the mechanism. Let us consider the following
options:

• Agent $n$ sees at least one agent from $S$ that took action $b$: Note that, assuming all
other agents follow the recommendation of the mechanism, according to the High
Visibility Mechanism an agent from $S$ will take action $b$ only if $V_b = \max\{V_a, V_b\}$
and the experiment flag is set to FALSE. In that case, however, experiment =
FALSE (see Case 7 below).

• Agent $n$ does not see any agent from $S$ that took action $b$:

– $M_n = (a, \mathrm{TRUE})$: Given that agent $n$ cannot see any agent from $S$ that took
action $b$, incentive compatibility follows from the proof of Theorem 2.6 for the
case $j \in p$.

– $M_n = (b, \mathrm{TRUE})$: Whenever an agent from $S$ arrives we use a new sub-segment
of $D_0$ (change from $D_0^z$ to $D_0^{z+1}$). As a result, if $V_a \in D_0^z \cup D_i$ then
agent $n$ cannot see any agent from $S$ that took action $b$, and therefore neither
the agents from $S$ nor the TRUE flag give him any additional information.
Incentive compatibility therefore follows from the proof of Theorem 2.6 for the
case $j \in p$.

4. Case 4: (experiment = TRUE, $k < K'$, $n \in T$ and $n \notin p$): Assume all other
agents follow the recommendation of the mechanism. Let us consider the following
options:

• Agent $n$ sees at least one agent from $S$ that took action $b$: Note that, assuming all
other agents follow the recommendation of the mechanism, according to the High
Visibility Mechanism an agent from $S$ will take action $b$ only if $V_b = \max\{V_a, V_b\}$
and the experiment flag is set to FALSE. In that case, however, experiment =
FALSE (see Case 7 below).

• Agent $n$ does not see any agent from $S$ that took action $b$: In that case incentive
compatibility follows from the proof of Theorem 2.6 for the case $n \in N$.

5. Case 5: (experiment = TRUE, $k = K'$, and knowledge = FALSE): In this
case the mechanism recommended $(b, \mathrm{SPECIAL})$. Assume all other agents follow the
recommendation of the mechanism. Since … $= [\mu_b, R]$, we get that this special
case is reached if and only if $V_a$ lies in the sub-segments of $D_0$ named in the special
case of the mechanism. But the expectation of $V_a$ conditional on that event is below $\mu_b$.
This implies that $b$ is at least as good as $a$. So an agent that receives SPECIAL in his
recommendation will comply.

6. Case 6: (experiment = TRUE, $k = K'$, and knowledge = TRUE): Assume all
other agents follow the recommendation of the mechanism. If $k = K'$ and knowledge =
TRUE, the mechanism finishes the experiment phase by setting experiment = FALSE,
and the message sent to agent $n$ and to all the following agents will contain FALSE.
Whenever the flag in the message is FALSE (experiment phase over) the planner
recommends agent $n$ the optimal action, and so he will comply.

7. Case 7: (experiment = FALSE): The message sent to the agent will contain
FALSE. Assume all other agents follow the recommendation of the mechanism.
Whenever the flag is FALSE (experiment phase over) the planner recommends agent
$n$ the optimal action, and so he will comply.

Q.E.D 
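As a reading aid, the seven cases of the proof can be written as a small classifier over the mechanism's state, which makes it easy to check that they are exhaustive and mutually exclusive. The function name `proof_case` and its boolean arguments are hypothetical, and the sketch assumes $k \le K'$ throughout.

```python
def proof_case(experiment: bool, k: int, K_prime: int,
               in_S: bool, in_p: bool, knowledge: bool) -> int:
    """Return the number (1 to 7) of the case of the proof that covers agent n.

    in_S: agent n belongs to S (otherwise n belongs to T).
    in_p: agent n belongs to the candidate set p (only relevant when n is in T).
    """
    if not experiment:
        return 7                        # experiment phase is over
    if k == K_prime:
        return 6 if knowledge else 5    # counter exhausted
    if in_S:
        return 2 if knowledge else 1    # high-degree agent during the experiment
    return 3 if in_p else 4             # agent from T: candidate or non-candidate


# Enumerate every state with k in {0, K'} and collect the cases that occur.
seen = set()
for experiment in (True, False):
    for k in (0, 3):
        for in_S in (True, False):
            for in_p in (True, False):
                for knowledge in (True, False):
                    seen.add(proof_case(experiment, k, 3, in_S, in_p, knowledge))
```

Every state maps to exactly one case because the function returns a single value, and the enumeration reaches all seven cases.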

Theorem 2.11. Fix some parameter $\alpha$ and let $T = \{n : |B(n)| < N^\alpha\}$. Let $S$ be the
remaining set of agents and assume $\beta$ satisfies $|S| < N^\beta$. Then the High Visibility Mechanism
will set experiment = FALSE after at most $3K \frac{\mu_b}{\mu_a - \mu_b} N^{\beta + 2\alpha}$ agents, where $K$ is the
number from Lemma 2.1.

Proof: 

We can deduce from the proof of Theorem 2.7 that at most $2(K' - 1)N^{2\alpha}$ agents from $T$
arrive before the parameter $k$ of the High Visibility Mechanism takes the value $K'$. Since
$|S| < N^\beta$ there are at most $N^\beta$ agents from $S$ that arrive before $k = K'$. Once $k = K'$
there could possibly be one extra agent until the parameter experiment takes on the value
FALSE. Therefore, the High Visibility Mechanism will set experiment = FALSE after less
than $2(K' - 1)N^{2\alpha} + N^\beta + 1$ agents. However, according to Lemma 2.3,
$K' < \frac{\mu_b}{\mu_a - \mu_b}(N^\beta + 1)(K - 1) + 2$. Therefore the maximal number of agents that arrive
before experiment = FALSE is

$$2(K' - 1)N^{2\alpha} + N^\beta + 1 < 2\left(\frac{\mu_b}{\mu_a - \mu_b}(N^\beta + 1)(K - 1) + 2 - 1\right)N^{2\alpha} + N^\beta + 1 < 3K\left(\frac{\mu_b}{\mu_a - \mu_b} N^{\beta + 2\alpha}\right).$$

Q.E.D 

Corollary 2.12. Fix some parameter $\alpha$ and let $T = \{n : |B(n)| < N^\alpha\}$. Let $S$ be the
remaining set of agents and assume $\beta$ satisfies $|S| < N^\beta$. If $\beta + 2\alpha < 1$ then the High
Visibility Mechanism is asymptotically optimal.

Proof:

As proved in Theorems 2.10 and 2.11, the High Visibility Mechanism is incentive compatible
and ensures experiment = FALSE after at most $3K \frac{\mu_b}{\mu_a - \mu_b} N^{\beta + 2\alpha}$ agents. And so at most
$3K \frac{\mu_b}{\mu_a - \mu_b} N^{\beta + 2\alpha}$ agents will take the inferior action. As $K, \mu_a, \mu_b$ are independent of the total
number of agents, $N$, and since $\beta + 2\alpha < 1$, the proportion of agents taking the optimal
action increases to one as $N$ grows. Q.E.D
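The asymptotic claim can be illustrated numerically. The sketch below assumes the bound of Theorem 2.11 reads $3K \frac{\mu_b}{\mu_a - \mu_b} N^{\beta + 2\alpha}$. The function names and the parameter values ($K = 2$, $\mu_a = 1$, $\mu_b = 0.5$, $\alpha = 0.2$, $\beta = 0.3$) are made up for illustration.

```python
def exploration_bound(N: int, K: int, mu_a: float, mu_b: float,
                      alpha: float, beta: float) -> float:
    """Upper bound on the number of agents arriving before experiment = FALSE."""
    if mu_a <= mu_b:
        raise ValueError("the model assumes mu_a > mu_b")
    return 3 * K * (mu_b / (mu_a - mu_b)) * N ** (beta + 2 * alpha)


def inferior_share(N: int, K: int, mu_a: float, mu_b: float,
                   alpha: float, beta: float) -> float:
    """Bound on the proportion of agents that may take the inferior action."""
    return exploration_bound(N, K, mu_a, mu_b, alpha, beta) / N


# Illustrative parameters with beta + 2 * alpha = 0.7 < 1.
shares = [inferior_share(N, K=2, mu_a=1.0, mu_b=0.5, alpha=0.2, beta=0.3)
          for N in (10**3, 10**6, 10**9)]
```

The share scales as $N^{\beta + 2\alpha - 1}$, so it shrinks toward zero when $\beta + 2\alpha < 1$ and stays constant in $N$ when $\beta + 2\alpha = 1$.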


2.4 The Very High Visibility Case 

As we have seen, the High Visibility Mechanism works well with social networks where $T =
\{n : |B(n)| < N^\alpha\}$ and $S$, the remaining set of agents, satisfies $|S| < N^\beta$, as long as
$2\alpha + \beta < 1$. What happens if the social network fails to satisfy these visibility requirements?
In our next theorem we argue, via an example, that IC and efficiency cannot be guaranteed
for such very high visibility networks.

In particular we demonstrate the impossibility for the case $\alpha = 0$ and $\beta = 1$, which
implies that all agents can see each other:

Theorem 2.13. Let $B(n) = N$ for all $n$ and assume $E(V_a \mid V_a < x) > \mu_b$, where $x = \inf\{y :
\mathrm{Prob}(V_b < y) = 1\}$. Then no incentive compatible asymptotically efficient mechanism exists^

Proof: Assume an IC and asymptotically efficient mechanism exists. Let $W_n \subset I$ be the
set of values of $V_a$ for which agent $n$ is the first agent that is recommended to experiment
with $b$. As the mechanism is asymptotically efficient, it must be the case that for any value
of $V_a < x$ some agent will experiment with $b$, hence $[L, x) \subset \bigcup_n W_n \subset I$, which implies

$$E[V_a \mid V_a \in \textstyle\bigcup_n W_n] \ge E(V_a \mid V_a < x) > \mu_b. \qquad (5)$$

IC, coupled with the fact that agent $n$ can observe all past agents, implies
$E[V_a \mid V_a \in W_n \setminus \bigcup_{m<n} W_m] \le \mu_b$ for all $n$. Hence
$E[V_a \mid V_a \in \bigcup_n (W_n \setminus \bigcup_{m<n} W_m)] \le \mu_b$. Note that
$\bigcup_n (W_n \setminus \bigcup_{m<n} W_m) = \bigcup_n W_n$ and so $E[V_a \mid V_a \in \bigcup_n W_n] \le \mu_b$,
contradicting inequality (5).

Q.E.D 
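The passage from the per-agent IC constraints to the constraint on the union is an averaging argument, which can be spelled out as follows. This is a sketch: the shorthand $W'_n$, for the part of $W_n$ not already assigned to an earlier agent, is introduced here and is not the authors' notation, and the sum runs over those $n$ with $P(V_a \in W'_n) > 0$.

```latex
% W'_n is shorthand introduced for this sketch: the values in W_n that do not
% belong to W_m for any earlier agent m < n. The sets W'_n are pairwise disjoint.
\[
E\Big[V_a \,\Big|\, V_a \in \bigcup_n W'_n\Big]
  = \sum_n \frac{P(V_a \in W'_n)}{P\big(V_a \in \bigcup_m W'_m\big)}\,
    E\big[V_a \mid V_a \in W'_n\big]
  \le \mu_b .
\]
```

Because the sets are disjoint, the weights sum to one, so the left-hand side is a weighted average of terms that are each at most $\mu_b$.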